Twelve Ways of Looking at the Foreign-Born Population of the United States

by Lilian Huang

In [0]:
!pip install geopandas

from google.colab import drive
import pandas as pd
import altair as alt
from vega_datasets import data
import numpy as np
import json
import geopandas as gpd
import shapely
from shapely.ops import polylabel
from shapely.geometry import Point
pd.set_option('display.max_rows', None)

drive.mount('/content/gdrive')
In [0]:
def class_theme():
    main_palette = ["#385ed4","#55b748","#db2b27","#b589da","#b75f31","#1696d2","#fdbf11","#ff1ae4"]
    sequential_palette = ["#cfe8f3","#aaecff","#a2d4ec","#73bfe2","#46abdb","#1696d2","#12719e","#0a4c6a","#062635"]
    return {
    "config": {
            "title": {
                "font": "Futura",
                "fontSize": 18,
                "anchor": "middle",
                "color": "darkblue",
            },
         "axisX": {
                "grid": False,
                "tickSize": 6,
             "labelFontSize": 10,
             "titleFontSize": 12,
         },
        "axisY": {
                "labelFontSize": 10,
                "tickSize": 6,
            "titleFontSize": 12,
         },
        "background": "white",
        "text": {
               "color": "#686863",
               "fontSize": 10,
               "fontWeight": 400,
            "baseline": "top",
            "filled": True,
            "lineBreak": "\n",
           },
            "bar": {
                "fill": "#1696d2"
            },
            "line": {
               "strokeWidth": 3,
           },
        "range": {
                "category": main_palette,
                "diverging": "blueorange",
            "ramp": sequential_palette,
            "ordinal": sequential_palette,
            "heatmap": sequential_palette
            },
        "legend": {
                "titleFontSize": 12
            }
    },
    }

alt.themes.register("my_custom_theme", class_theme)
alt.themes.enable("my_custom_theme")

A first look

We begin with a broad perspective on the foreign-born population as a whole, and how its size has varied across time and space.

Variation across time

In [0]:
newpop_df = pd.read_csv("/content/gdrive/My Drive/datavizdata/processed_data/histandcurpopn.csv")

annotations = {
    1880: "Rise of steamship travel",
    1924: "Immigration Act of 1924 \nestablishes country quotas",
    1965: "Hart-Celler Act of 1965 \nabolishes country quotas"
}
annot_df = pd.DataFrame.from_dict(annotations, orient='index').reset_index()
annot_df.columns = ['Year', 'Annotation']
In [60]:
foreignline1 = alt.Chart(newpop_df[(newpop_df['Year'] <= 2017) & (newpop_df['variable'] == 'Total Foreign-Born Population')]).mark_area(
    color="#385ED4",
    line=True
).encode(
    x=alt.X('Year', axis=alt.Axis(format='d', title='Year')),
    y=alt.Y('value', title="Population")
).properties(
    width=700, height=300, title={
    "text": "US foreign-born population keeps growing with minor speedbumps",
    "subtitle": ["This shows the foreign-born population's actual growth until 2017, and its projected growth from 2018 onwards.",
                 "Source: actual figures from the Pew Research Center and Census Bureau, population projections from the Census Bureau"],
    "subtitleColor": "#686863"
  })

foreignline2 = alt.Chart(newpop_df[(newpop_df['Year'] >= 2017) & (newpop_df['variable'] == 'Total Foreign-Born Population')]).mark_area(color="#385ED4", line=True, fillOpacity=0.3).encode(
    x=alt.X('Year', axis=alt.Axis(format='d', title='Year')),
    y=alt.Y('value', title="Population")
).properties(
    width=700, height=300)

timelines = alt.Chart(annot_df).mark_rule(color='black', strokeWidth=1, strokeDash=[5,5]).encode(x='Year')

timetext = alt.Chart(annot_df).mark_text(
    dy = -100, dx=3, align='left'
).encode(
    x="Year",
    text='Annotation'
)

alt.layer(foreignline1, foreignline2, timelines, timetext)
Out[60]:

Data source: Pew Research Center population figures, Census Bureau population projections, Census Bureau historical data

Throughout most of the history of the United States, the foreign-born population has been growing, other than a dip from the 1920s to 1960s due to the imposition of immigration quotas. Ever since those quotas were abolished by the Immigration and Nationality Act of 1965 (also called the Hart-Celler Act), the size of the foreign-born population has been steadily climbing. It grew faster in the 1970s than in the 1960s, and even faster in the 1990s. The absolute size of the foreign-born population is still growing, and by Census Bureau calculations, it is projected to keep growing until 2060. This may initially seem to lend credence to claims that the immigrant population is poised to become the dominant majority in the United States.

In [61]:
totalline = alt.Chart(newpop_df).mark_area(opacity=0,
    line=True
).encode(
    x=alt.X('Year', axis=alt.Axis(format='d', title='Year')),
      y=alt.Y("value", stack=None, title="Population"),
    color=alt.Color("variable", legend=alt.Legend(title=""))
).properties(
    width=650, height=300,
    title={
    "text": "Reports of immigrant takeover are greatly exaggerated",
    "subtitle": ["This shows the actual growth of both the native and foreign-born populations until 2017, and their respective projected growth from 2018 onwards.",
                 "The red line indicates the cutoff between actual figures and projected figures.",
                 "Source: actual figures from the Pew Research Center and Census Bureau, population projections from the Census Bureau"],
    "subtitleColor": "#686863"    
  })

overlay = pd.DataFrame({'Year': [2017]})
vline = alt.Chart(overlay).mark_rule(color='red', strokeWidth=1, strokeDash=[6,2]).encode(x='Year')

alt.layer(
    totalline, vline)
Out[61]:

Data source: Pew Research Center population figures, Census Bureau population projections, Census Bureau historical data

However, when we place this trend in the context of total population growth, we see that the native-born population of the United States has been growing at a similar, and in fact slightly greater, pace than the foreign-born population. This is projected to continue into the future (2017 is the last year for which actual data is available at present). The foreign-born population also still constitutes a relatively small proportion of the total population of the United States, hovering around one-sixth in both current data and future projections.
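The comparison above comes down to two quantities: each group's absolute growth, and the foreign-born share of the total. Below is a minimal sketch of that arithmetic, assuming a long-format table with the same `Year`/`variable`/`value` layout as `newpop_df`. The numbers are invented for illustration (they are not Pew or Census figures), and the native-born label is an assumed name.

```python
import pandas as pd

# Hypothetical long-format rows mirroring the Year/variable/value layout.
# The values are invented, and "Total Native-Born Population" is an assumed label.
rows = [
    (2000, "Total Foreign-Born Population", 30_000_000),
    (2000, "Total Native-Born Population", 250_000_000),
    (2020, "Total Foreign-Born Population", 45_000_000),
    (2020, "Total Native-Born Population", 285_000_000),
]
pop = pd.DataFrame(rows, columns=["Year", "variable", "value"])

# Reshape to one row per year and one column per population group.
wide = pop.pivot(index="Year", columns="variable", values="value")
foreign = wide["Total Foreign-Born Population"]
native = wide["Total Native-Born Population"]

# Share of the total population that is foreign-born in each year.
share = foreign / (foreign + native)

# Absolute growth of each group between the first and last year.
foreign_growth = foreign.iloc[-1] - foreign.iloc[0]
native_growth = native.iloc[-1] - native.iloc[0]

print(share.round(3).to_dict())
print(foreign_growth, native_growth)
```

In this made-up example the foreign-born share rises, yet the native-born population still adds more people in absolute terms, which is the distinction the second chart is meant to draw out.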

Nonetheless, it is clear that the foreign-born population is indeed growing, and therefore it is worthwhile investigating where this growth is most pronounced.

In [0]:
change_url = 'https://raw.githubusercontent.com/lilianhj/datavizdata/master/percentchangebystate.csv'
change_df = pd.read_csv(change_url)

change_df['Percent Change: 1990 to 2017'] = (change_df['Percent Change: 1990 to 2017']/100)
In [63]:
base = alt.Chart(change_df).mark_bar().encode(
    x=alt.X('State', sort=['Division'], title=''),
    y=alt.Y('Percent Change: 1990 to 2017', title="Percentage Change", axis=alt.Axis(format='%'))
).properties(
    width=170,
    height=170
)

meanline = alt.Chart(change_df).mark_rule(color="black", strokeWidth=3).encode(
    y='mean(Percent Change: 1990 to 2017)'
)

chart = alt.hconcat().properties(title={
    "text": "Regional change in foreign-born population from 1990 to 2017",
    "subtitle": ["This shows the percentage change in the size of each state's foreign-born population from 1990 to 2017.", 
                 "States are grouped by their Census-defined geographic region, with the gold line indicating the mean change for that region. The black line indicates the mean change nationwide.",
                 "Source: Migration Policy Institute"],
    "subtitleColor": "#686863"
  })
for reg in change_df['Region'].unique():
    chart |= (base.transform_filter(alt.datum.Region == reg).properties(title={
    "text": reg, "fontSize": 14}) + meanline + meanline.transform_filter(alt.datum.Region == reg).mark_rule(color="#fdbf11", strokeWidth=2, strokeDash=[6,2]))
chart
Out[63]:

Data source: Migration Policy Institute

We can see that the growth varies geographically. From 1990 to 2017, foreign-born population growth was highest in the South, whose states had the highest average percentage change: an increase of approximately 300%. The Northeast had the lowest average percentage change, an increase of under 100% over the same period. The South also shows the widest range of variation, partly as a result of the Census's broad definition of what constitutes the "South".

There is also sizeable variation within each Census-defined region. One possibly surprising observation is that, in the West, California underwent a significantly smaller percentage change than Nevada. This is because California already had a much larger foreign-born population in 1990: its increase from 1990 to 2017 was greater than Nevada's in absolute numbers, but significantly smaller in proportional terms.
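The difference between absolute and percentage change can be made concrete with a small worked example. This is only a sketch: the two "states" and their populations are invented round numbers, not Migration Policy Institute figures, and `percent_change` is a helper defined here for illustration.

```python
def percent_change(start, end):
    """Return the relative change from start to end as a fraction (1.0 == 100%)."""
    if start <= 0:
        raise ValueError("start must be positive")
    return (end - start) / start


# Invented figures: one state with a large 1990 base, one with a small base.
large_base = {"start": 6_000_000, "end": 10_000_000}
small_base = {"start": 100_000, "end": 500_000}

# Absolute change: how many more foreign-born residents each state gained.
large_abs = large_base["end"] - large_base["start"]
small_abs = small_base["end"] - small_base["start"]

# Percentage change: the same gain relative to the starting population.
large_pct = percent_change(**large_base)
small_pct = percent_change(**small_base)

print(f"large base: +{large_abs:,} residents, {large_pct:.0%} change")
print(f"small base: +{small_abs:,} residents, {small_pct:.0%} change")
```

The large-base state gains ten times as many residents, but the small-base state shows the far larger percentage change, which is the same pattern described for California and Nevada.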

Given the evident variation among geographic regions, our next step is to examine the spatial distribution of the foreign-born population in further detail.

Variation across space

In [0]:
# Use the absolute mount path, consistent with the other file reads in this notebook.
geojsonfile = '/content/gdrive/My Drive/datavizdata/us-states.geojson'
with open(geojsonfile) as json_data:
    d = json.load(json_data)
gdf = gpd.GeoDataFrame.from_features((d))

us_centroids = pd.read_csv("/content/gdrive/My Drive/datavizdata/usa_centroids.csv")
gdf_with_cent = gdf.merge(us_centroids[['name', 'centrepoint_lon', 'centrepoint_lat']], on='name', how='inner')

foreign_born_df = pd.read_csv("/content/gdrive/My Drive/datavizdata/processed_data/foreignbornpopn.csv")

f_bins = [0, 50000, 100000, 200000, 300000, 500000, 1000000, 3000000, 10000000, 50000000]
f_labels = ["50,000 or less", "50,001-100,000", "100,001-200,000", "200,001-300,000", "300,001-500,000", "500,001-1,000,000", "1,000,001-3,000,000", "3,000,001-10,000,000", "Above 10,000,000"]
foreign_born_df['bin'] = pd.cut(foreign_born_df['Total Foreign-born Population'], bins=f_bins, labels=f_labels)

foreign_map = gdf_with_cent.merge(foreign_born_df, left_on='name', right_on='State', how='inner')

foreign_json = json.loads(foreign_map.to_json())
foreign_data = alt.Data(values=foreign_json['features'])
In [65]:
foreign_chart = alt.Chart(foreign_data).mark_geoshape(stroke='white'
    ).encode(
        alt.Color('properties["bin"]', 
                  type='ordinal',
                  scale = alt.Scale(domain=f_labels, type='ordinal'),
                  title = "Population")
    ).project(type='albersUsa').properties(width=650, height=350, title={
    "text": "Foreign-born population of each state",
    "subtitle": ["This shows the number of foreign-born residents in each state in 2016.",
                 "Source: American Community Survey"],
    "subtitleColor": "#686863"
  })

foreign_labels = alt.Chart(foreign_data).mark_text(baseline='top', color="red"
     ).encode(
         longitude='properties.centrepoint_lon:Q',
         latitude='properties.centrepoint_lat:Q',
         text='properties.postal:O',
         size=alt.value(8),
         opacity=alt.value(1)
     )

foreign_chart + foreign_labels
Out[65]:

Data source: American Community Survey estimates of foreign-born residents (table B05013 for 2016)

This map lets us visualize the geographical distribution of the foreign-born population across the United States more clearly. California's foreign-born population is far larger than that of any other state, but Texas, Florida, Illinois, New York, New Jersey, and Massachusetts also rank among the states with the largest foreign-born populations.
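The color categories on this map come from `pd.cut`, which by default treats each bin as open on the left and closed on the right. The sketch below reuses the bin edges and labels from the cell above, and applies them to a few hypothetical populations chosen to sit on or just past a bin edge.

```python
import pandas as pd

# Same bin edges and labels as the map above.
f_bins = [0, 50000, 100000, 200000, 300000, 500000, 1000000, 3000000, 10000000, 50000000]
f_labels = ["50,000 or less", "50,001-100,000", "100,001-200,000", "200,001-300,000",
            "300,001-500,000", "500,001-1,000,000", "1,000,001-3,000,000",
            "3,000,001-10,000,000", "Above 10,000,000"]

# Hypothetical populations, not real state figures.
sample = pd.Series([50000, 50001, 1000000, 10000001])

# Default right=True: each bin is (lower, upper], so 50000 lands in the first bin
# and 50001 in the second, matching the wording of the labels.
binned = pd.cut(sample, bins=f_bins, labels=f_labels)
print(binned.tolist())

# A value of exactly 0 falls outside the first bin unless include_lowest=True.
zero_binned = pd.cut(pd.Series([0]), bins=f_bins, labels=f_labels)
print(zero_binned.isna().tolist())
```

Because the labels are passed in order, they can also be handed to `alt.Scale(domain=f_labels)` so that the legend follows bin order rather than alphabetical order.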

"Unauthorized" immigration

When discussing the foreign-born population, one key question is how its members entered the country, and whether they are authorized to reside in it.

Unauthorized immigrants are defined by the Migration Policy Institute as foreign-born residents who are not (naturalized) citizens, not lawful permanent residents (green card holders), not refugees/asylees, and do not legally hold temporary visas.

In [0]:
unauth_url = 'https://raw.githubusercontent.com/lilianhj/datavizdata/master/unauthorizedpopn.csv'
unauth_df = pd.read_csv(unauth_url)

u_bins = [0, 25000, 50000, 75000, 100000, 200000, 300000, 500000, 1000000, 5000000]
u_labels = ["25,000 or less", "25,001-50,000", "50,001-75,000", "75,001-100,000", "100,001-200,000", "200,001-300,000", "300,001-500,000", "500,001-1,000,000", "Above 1,000,000"]
unauth_df['bin'] = pd.cut(unauth_df['Total Unauthorized Population'], bins=u_bins, labels=u_labels)

unauth_df['Unauth_Per_Foreign_Capita'] = unauth_df['Total Unauthorized Population'] / unauth_df['Total Foreign-born Population']

unauth_df['surprise'] = unauth_df['Unauth_Per_Foreign_Capita'] - unauth_df['Unauth_Per_Foreign_Capita'].mean()

unauth_map = gdf_with_cent.merge(unauth_df, left_on='name', right_on='State', how='inner')

unauth_json = json.loads(unauth_map.to_json())
unauth_data = alt.Data(values=unauth_json['features'])
In [67]:
us_bgmap = alt.Chart(foreign_data).mark_geoshape(fill="lightgrey", stroke='white'
    ).project(type='albersUsa').properties(width=650, height=350)

unauth_chart = alt.Chart(unauth_data).mark_geoshape(stroke='white'
    ).encode(
        alt.Color('properties["bin"]', 
                  type='ordinal',
                  scale = alt.Scale(domain=u_labels, type='ordinal'),
                  title = "Population")
    ).project(type='albersUsa').properties(width=650, height=350, title={
    "text": "Unauthorized population of each state",
    "subtitle": ["This shows the number of unauthorized immigrants in each state in 2016.",
                 "(Data is not available for North Dakota and Vermont.)",
                 "Source: Migration Policy Institute"],
    "subtitleColor": "#686863"
  })

unauth_labels = alt.Chart(unauth_data).mark_text(baseline='top', color="red"
     ).encode(
         longitude='properties.centrepoint_lon:Q',
         latitude='properties.centrepoint_lat:Q',
         text='properties.postal:O',
         size=alt.value(8),
         opacity=alt.value(1)
     )

us_bgmap + unauth_chart + unauth_labels
Out[67]:

Data source: Migration Policy Institute estimates of unauthorized immigrants

This map reflects the geographical distribution of such immigrants across the United States. We see that while there is a large population of unauthorized immigrants in California, the differential with other states is less pronounced than for the general foreign-born population, especially for Texas, Florida, and New York.

Juxtaposing this with the previous map, we can see how the number of unauthorized immigrants in a state relates to its total number of foreign-born residents (in 2016). The patterns roughly correspond: states with more foreign-born residents tend to have more unauthorized immigrants, and California has by far the largest of both. However, from these counts alone, we cannot tell whether certain states have unusually high proportions of unauthorized immigrants.

In [68]:
unauth_per_cap = alt.Chart(unauth_data).mark_geoshape(stroke='white'
    ).encode(
        alt.Color('properties["Unauth_Per_Foreign_Capita"]', 
                  type='quantitative',
                  title = "Proportion")
    ).project(type='albersUsa').properties(width=650, height=350, title={
    "text": "Proportion of state's foreign-born population that is unauthorized",
    "subtitle": ["This shows the proportion of each state's foreign-born population that consists of unauthorized immigrants, as of 2016.",
                 "(Data is not available for North Dakota and Vermont.)",
                 "Source: Migration Policy Institute"],
    "subtitleColor": "#686863"
  })
    
us_bgmap + unauth_per_cap + unauth_labels
Out[68]:

Data source: Migration Policy Institute estimates of unauthorized immigrants, American Community Survey estimates of foreign-born residents (table B05013 for 2016)

This map shows the proportion of each state's foreign-born population that consists of unauthorized immigrants (as previously defined) - in other words, the unauthorized rate among the foreign-born population.

Neither Arkansas's foreign-born population nor its unauthorized population is especially large in absolute numbers. However, Arkansas has the highest proportion of foreign-born residents who are unauthorized, at over 40%; other nearby states such as North Carolina and Tennessee are close behind. Many undocumented immigrants settle in these states to pursue employment in manufacturing (specifically meat processing) or construction.

In [69]:
surprise_map = alt.Chart(unauth_data).mark_geoshape(stroke='white'
    ).encode(
        alt.Color('properties["surprise"]', 
                  type='quantitative',
                  title = "Deviation", 
                  scale = alt.Scale(scheme="blueorange"))
    ).project(type='albersUsa').properties(width=650, height=350, title={
    "text": "Deviation of state's unauthorized rate from nationwide average",
    "subtitle": ["This shows how much the state's proportion of foreign-born residents who are unauthorized deviates from the nationwide average."],
    "subtitleColor": "#686863"
  })

us_bgmap + surprise_map + unauth_labels
Out[69]:

Data source: calculated from Migration Policy Institute estimates of unauthorized immigrants, American Community Survey estimates of foreign-born residents (table B05013 for 2016)

This map puts the previous one in context. It shows the difference between each state's unauthorized rate (the proportion of its foreign-born population that is unauthorized) and the average rate. Note that this average is the unweighted mean of the state rates, so each state counts equally regardless of the size of its population.

We can again see that Arkansas and North Carolina have unusually high unauthorized rates compared to other states, while states such as Indiana and Kentucky hew to the national average. Moreover, Florida has fewer unauthorized immigrants than might be expected, given the size of its overall foreign-born population; this is not visible simply from looking at the maps of absolute population numbers, as Florida ranks among the top states both in terms of unauthorized population size and foreign-born population size.
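The deviation shown on this map is measured against the mean of the per-state rates, which is not the same thing as the pooled rate for the country as a whole. Here is a minimal sketch of the difference, using three invented states; the counts are made up purely to show the arithmetic.

```python
from statistics import mean

# Invented counts for three hypothetical states: (unauthorized, foreign-born).
states = {
    "Large": (2_000_000, 10_000_000),   # rate 0.20
    "Medium": (300_000, 1_000_000),     # rate 0.30
    "Small": (40_000, 100_000),         # rate 0.40
}

# Per-state unauthorized rate among the foreign-born population.
rates = {name: unauth / foreign for name, (unauth, foreign) in states.items()}

# Unweighted mean of the state rates: every state counts equally.
mean_of_rates = mean(rates.values())

# Pooled rate: total unauthorized divided by total foreign-born across all states.
pooled_rate = (sum(u for u, _ in states.values())
               / sum(f for _, f in states.values()))

# Deviation of each state from the unweighted mean, as plotted on the map.
deviation = {name: rate - mean_of_rates for name, rate in rates.items()}

print(round(mean_of_rates, 3), round(pooled_rate, 3))
print({name: round(d, 3) for name, d in deviation.items()})
```

In this example the pooled rate is pulled toward the large state, while the unweighted mean sits in the middle of the three rates. Which baseline is more appropriate depends on whether the question is about typical states or about typical residents.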

Authorized immigration: Refugees

Having looked at the overall growth of the foreign-born population and the subset of that population which is not legally authorized to reside in the United States, we now consider one of the possible paths for legal immigration - being admitted as a refugee.

Where do they come from?

In [0]:
origins_url = 'https://raw.githubusercontent.com/lilianhj/datavizdata/master/originregionsbyyear.csv'
origins_df = pd.read_csv(origins_url)

annotations_refugees = {
    1999: "Kosovo War",
    2002: "Post-9/11 refugee freeze"
}
annot_refugees_df = pd.DataFrame.from_dict(annotations_refugees, orient='index').reset_index()
annot_refugees_df.columns = ['Year', 'Annotation']
In [71]:
barchart = alt.Chart(origins_df).mark_bar().encode(
    x='Year:O',
    y=alt.Y('sum(value)', title="Number of Admissions"),
    color='Origin'
).properties(
    height=300,
        title={
    "text": "Annual refugee admissions by region of origin",
    "subtitle": ["This shows the number of refugee admissions per year, classified by the nationality of the principal applicant.",
                 "Source: Refugee Processing Center"],
    "subtitleColor": "#686863"
  }
)

yrline = alt.Chart(annot_refugees_df).mark_rule(color='black', strokeWidth=1, strokeDash=[5,5]).encode(x='Year:O')

thetext = alt.Chart(annot_refugees_df).mark_text(
    dy = -100, dx=3, align='left'
).encode(
    x="Year:O",
    text='Annotation'
)

barchart + yrline + thetext
Out[71]:

Data source: Refugee Processing Center

Data from the Department of State's Refugee Processing Center shows that the number of annual refugee admissions has overall decreased from a high in 1992; there was a steep dip for 2002 and 2003, presumably due to the admissions freeze and increased security in refugee screening after 9/11. There was an overall increasing trend from 2004 to 2016, but from 2017 onwards, the annual number of admissions has dropped sharply, possibly due to the change in presidency.

In addition, the distribution of nationalities admitted as refugees has shifted quite drastically over time; in the 1990s, refugees admitted predominantly hailed from Asia and the former Soviet Union, with an upswing in refugees from other European nations towards the end of the decade - notably a large influx of refugees from the Kosovo War in 1999. From the 2000s onwards, refugees from Africa, and the Near East and South Asia, have gained prominence in the mix. (From 2004 onwards, there are zero admissions recorded from the former Soviet Union, which suggests that the Department of State may have shifted its classification to begin recording these as European admissions - which is supported by the jump in European admissions for 2004 through 2006.)
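One way to check the reclassification hypothesis is to fold the two categories together and see whether the combined series is smoother than either one alone. The sketch below is not something this notebook does; it uses invented counts and an assumed mapping from "Former Soviet Union" to "Europe".

```python
from collections import defaultdict

# Invented admissions counts: (year, origin, admissions).
records = [
    (2003, "Former Soviet Union", 8_000),
    (2003, "Europe", 2_000),
    (2004, "Former Soviet Union", 0),
    (2004, "Europe", 9_000),
]

# Assumed mapping: treat the former Soviet Union as part of Europe in every year.
MERGE = {"Former Soviet Union": "Europe"}

# Re-aggregate admissions under the merged labels.
combined = defaultdict(int)
for year, origin, value in records:
    combined[(year, MERGE.get(origin, origin))] += value

print(dict(combined))
```

With these made-up numbers, Europe alone appears to jump from 2,000 to 9,000, while the merged category moves only from 10,000 to 9,000. A similar result on the real data would be consistent with a change in classification rather than a change in who was admitted.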

(The Private Sector Initiative was implemented by President Ronald Reagan in 1986, and discontinued under the Clinton presidency; refugees who were admitted under this initiative were sponsored by private funders, and were thus recorded separately from other refugees, and not associated in this data with their nation of origin. However, researchers note that these refugees were primarily Cubans and Soviet Jews.)

In [0]:
monthly_df = pd.read_csv("/content/gdrive/My Drive/datavizdata/processed_data/refugeeadmissionsbyregionbymonth2019.csv")
In [73]:
alt.Chart(monthly_df).mark_line().encode(
    x=alt.X('Month', sort=['October', 'November', 'December', 'January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September']),
    y=alt.Y("value:Q", title="Number of Admissions"),
    color=alt.Color('Country:N', legend=alt.Legend(title="Region of Origin"))
).properties(
    height=200,
    width=500,
        title={
    "text": "Refugee admissions throughout Fiscal Year 2019",
    "subtitle": ["This shows the number of refugee admissions in each month of FY 2019, separated by the applicant's region of origin.",
                 "Source: Refugee Processing Center"],
    "subtitleColor": "#686863"
  }
)
Out[73]:

Data source: Refugee Processing Center

It is also worthwhile examining how this influx of admitted refugees varies over the course of a year, to gain a better sense of how the flow varies not only spatially but also temporally.

We see that during Fiscal Year 2019, most refugees admitted in every month (except September) were from Africa, but each region peaked at a different point in the year. Admissions from Africa peaked in May and remained especially high through the summer, admissions from Europe peaked in August and September (with an influx of refugees from Ukraine), and admissions from the Near East and South Asia peaked in July. Overall, we see a high volume of admissions during the summer months.
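The x-axis of the chart above starts in October because the US federal fiscal year runs from October 1 to September 30, so FY 2019 covers October 2018 through September 2019. The month order can be generated rather than typed out; the two helper functions below are defined here for illustration and are not part of the notebook.

```python
import calendar


def fiscal_month_order(start_month=10):
    """Return the twelve month names in fiscal-year order, starting at start_month (1-12)."""
    if not 1 <= start_month <= 12:
        raise ValueError("start_month must be between 1 and 12")
    # calendar.month_name has an empty string at index 0; names follow the current locale.
    names = list(calendar.month_name)[1:]
    return names[start_month - 1:] + names[:start_month - 1]


def fiscal_year(year, month, start_month=10):
    """Return the fiscal year label for a given calendar year and month."""
    return year + 1 if month >= start_month else year


fy_months = fiscal_month_order()
print(fy_months)
print(fiscal_year(2018, 10), fiscal_year(2019, 9))
```

The resulting list can be passed as the `sort` argument of `alt.X('Month', ...)` in place of the hand-written list.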

In [0]:
africa_df = pd.read_csv("/content/gdrive/My Drive/datavizdata/processed_data/africarefugees2019.csv")
In [75]:
alt.Chart(africa_df).mark_bar().encode(
    x=alt.X('Country', sort='-y'), 
    y=alt.Y('sum(value)', title="Number of Admissions")).properties(height=300, title={
    "text": "Refugee admissions from Africa in Fiscal Year 2019",
    "subtitle": ["This shows the total number of refugee admissions from each African country in FY 2019.",
                 "Source: Refugee Processing Center"],
    "subtitleColor": "#686863"
  })
Out[75]:

Data source: Refugee Processing Center

The influx of African refugees seems large enough to warrant a closer look. The vast majority hail from the Democratic Republic of the Congo, fleeing armed conflict between the government and local militias; a smaller number come from Eritrea, seeking to escape a highly repressive totalitarian government. The European Union has generally tightened its immigration restrictions, most notably by implementing a stringent disembarkation policy in June 2018, under which intercepted refugees are sent to Libya and detained in harsh conditions for further processing. As a result, the United States is one of the few remaining alternatives for these refugees.

In [0]:
# Use the absolute mount path, consistent with the other file reads in this notebook.
africajson = '/content/gdrive/My Drive/datavizdata/africa.json'
with open(africajson) as africa_data:
    d_africa = json.load(africa_data)
gdf_africa = gpd.GeoDataFrame.from_features((d_africa))

africa_centroids = pd.read_csv("/content/gdrive/My Drive/datavizdata/africa_centroids.csv")
gdf_africa_with_cent = gdf_africa.merge(africa_centroids[['name', 'centrepoint_lon', 'centrepoint_lat']], on='name', how='inner')

# Sum only the admissions column, so that non-numeric columns are not aggregated.
africa_to_join = africa_df.groupby('Country', as_index=False)['value'].sum()
africa_to_join["in_data"] = 1

africa_map = gdf_africa_with_cent.merge(africa_to_join, left_on='name', right_on='Country', how='left')

a_bins = [0, 10, 25, 50, 100, 250, 1000, 10000, 15000]
a_labels = ["10 or less", "11-25", "26-50", "51-100", "101-250", "251-1,000", "1,001-10,000", "Above 10,000"]
africa_map['bin'] = pd.cut(africa_map['value'], bins=a_bins, labels=a_labels)

africa_choro_json = json.loads(africa_map.to_json())
africa_choro_data = alt.Data(values=africa_choro_json['features'])

africa_only_labels = africa_map[africa_map["in_data"] == 1]
africa_labels_json = json.loads(africa_only_labels.to_json())
africa_labels_data = alt.Data(values=africa_labels_json['features'])
In [0]:
africa_bgmap = alt.Chart(africa_choro_data).mark_geoshape(fill="lightgrey", stroke="white").properties(width=500, height=700,
                                                                                                     title={
    "text": "Number of refugee admissions from each African country",
    "subtitle": ["This shows the total number of refugee admissions from each African country in FY 2019.",
                 "Source: Refugee Processing Center"],
    "subtitleColor": "#686863"
  })

africalabels = alt.Chart(africa_labels_data).mark_text(baseline='middle', color="red"
     ).encode(
         longitude='properties.centrepoint_lon:Q',
         latitude='properties.centrepoint_lat:Q',
         text='properties.name:N',
         size=alt.value(8),
         opacity=alt.value(1)
     )
In [78]:
africa_admits = alt.Chart(africa_choro_data).mark_geoshape(stroke="white").encode(alt.Color('properties["bin"]', 
                  type='ordinal', scale = alt.Scale(domain=a_labels, type='ordinal'), title=""))

africa_bgmap + africa_admits + africalabels
Out[78]:

Data source: Refugee Processing Center

This map visualizes the location of these countries within the African continent, and colors them by the number of refugees admitted to the United States in FY 2019. The predominance of DRC refugees is again clear in this chart.
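Countries with no recorded admissions still appear on the map (in grey) because the admissions table is joined onto the country shapes with a left merge, and only matched countries are given text labels. Here is a minimal sketch of that pattern, with placeholder country names and invented counts standing in for the real shapes and admissions data.

```python
import pandas as pd

# Hypothetical stand-ins: every country that has a map shape...
shapes = pd.DataFrame({"name": ["Country A", "Country B", "Country C"]})

# ...and admissions totals for only some of them.
admissions = pd.DataFrame({"Country": ["Country A", "Country C"], "value": [120, 8]})
admissions["in_data"] = 1  # flag marking rows that come from the admissions data

# how="left" keeps every shape; countries without admissions get NaN in the joined columns.
merged = shapes.merge(admissions, left_on="name", right_on="Country", how="left")

# Only countries that matched an admissions record are kept for labelling.
labelled = merged[merged["in_data"] == 1]

print(merged)
print(labelled["name"].tolist())
```

An inner merge would instead drop the unmatched countries altogether, leaving holes in the map.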

Where are they going?

After refugees are admitted, we can further examine their placements within the United States. The following charts visualize where refugees who were admitted in FY 2018 took up residence.

In [0]:
flow_df = pd.read_csv("/content/gdrive/My Drive/datavizdata/processed_data/forsankeyonlyregion.csv")

merged_flow_df = foreign_map[['State', "geometry", "centrepoint_lon", "centrepoint_lat", "postal"]].merge(flow_df, left_on='State', right_on='target', how='inner')

merged_flow_df = merged_flow_df.sort_values(by=['source', 'target'])

all_flow_dfs = []

for region in merged_flow_df['source'].unique():
  flow_json = json.loads(merged_flow_df[merged_flow_df['source'] == region].to_json())
  flow_data = alt.Data(values=flow_json['features'])
  all_flow_dfs.append((region, flow_data))
In [0]:
all_flow_charts = []
small_flow_charts = []

for region_data in all_flow_dfs:
  title = f"Placements for refugees from {region_data[0]}"
  base = alt.Chart(region_data[1]).mark_geoshape(
        stroke='white'
    ).encode(
        alt.Color('properties["value"]', 
                  type='quantitative', 
                  scale=alt.Scale(scheme='tealblues'),
                  title="Placements")
    ).project(type='albersUsa').properties(title={"text": title, "subtitle": ["This shows the US states where refugees originating from this region were placed in FY 2018.",
                 "Source: Worldwide Refugee Admissions Processing System"],
    "subtitleColor": "#686863"})
  labels = alt.Chart(region_data[1]).mark_text(baseline='top', color="red"
     ).encode(
         longitude='properties.centrepoint_lon:Q',
         latitude='properties.centrepoint_lat:Q',
         text='properties.postal:O',
         size=alt.value(8),
         opacity=alt.value(1)
     )
  smallbase = alt.Chart(region_data[1]).mark_geoshape(
        stroke='white'
    ).encode(
        alt.Color('properties["value"]', 
                  type='quantitative', 
                  scale=alt.Scale(scheme='tealblues'),
                  legend=None)
    ).project(type='albersUsa').properties(height=250, width=400, title={"text": region_data[0], "fontSize": 14})
  all_flow_charts.append((us_bgmap + base + labels))
  small_flow_charts.append((us_bgmap.properties(height=250, width=400) + smallbase + labels))

We first view the different groups of refugees side-by-side for ease of comparison.

In [81]:
alt.vconcat((small_flow_charts[0] | small_flow_charts[1]), (small_flow_charts[2] | small_flow_charts[3]), small_flow_charts[4]).properties(
    title={"text": "Regional variations in refugee placements",
           "subtitle": ["This shows the US states where refugees originating from each region were placed in FY 2018.",
          "Source: Worldwide Refugee Admissions Processing System"],
           "subtitleColor": "#686863"})
Out[81]:

Data source (for this and all subsequent charts): Worldwide Refugee Admissions Processing System (WRAPS)

We can observe varying patterns of placements for refugees who originate from different regions. Each group of refugees has its own particular distribution that warrants closer examination.
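The individual charts are built by splitting the placement records by source region after sorting, which is why the regions come out in alphabetical order. The sketch below shows the same split in plain Python, with invented placement counts; only the region and state names are taken from the discussion here.

```python
# Invented placement records: (source region, destination state, placements).
flows = [
    ("Near East/South Asia", "Ohio", 50),
    ("Africa", "Texas", 70),
    ("Europe", "Washington", 40),
    ("East Asia", "Minnesota", 30),
    ("Latin America/Caribbean", "California", 10),
    ("Africa", "Arizona", 60),
]

# Sorting by (source, target) first means regions are encountered alphabetically.
flows_sorted = sorted(flows, key=lambda record: (record[0], record[1]))

# Group destinations under each source region, preserving that order.
by_region = {}
for source, target, value in flows_sorted:
    by_region.setdefault(source, []).append((target, value))

region_order = list(by_region)
print(region_order)
```

Under this ordering, position 0 corresponds to Africa and position 4 to the Near East/South Asia, so looking charts up by region name rather than by list position would be a more robust alternative if the set of regions ever changed.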

In [82]:
all_flow_charts[0]
Out[82]:

Refugees originating from Africa were placed all over the United States, but with especially high concentrations in Texas and Arizona.

In [83]:
all_flow_charts[1]
Out[83]:

Refugees from East Asia were also placed across many states, but especially in Texas and Minnesota.

In [84]:
all_flow_charts[2]
Out[84]:

Refugees from Europe were not dispersed as widely; the highest concentration of placements was in the state of Washington.

In [85]:
all_flow_charts[3]
Out[85]:

The number of Latin American/Caribbean refugees was much smaller compared to the numbers from other regions, but they were mainly placed in California.

In [86]:
all_flow_charts[4]
Out[86]:

Refugees from the Near East/South Asia - almost all of whom originated from Bhutan - were predominantly placed in Ohio. The city of Akron is especially receptive, and has a thriving community of Bhutanese refugees, due to the efforts of resettlement agencies such as the International Institute of Akron.

These explorations may only have scratched the surface of a complex topic, but they make clear that the foreign-born population's presence in the United States is substantial, and more multifaceted than many conventional perceptions suggest.